An Introduction to Application-Independent Evaluation of Speaker Recognition Systems
Abstract
In the evaluation of speaker recognition systems, the trade-off between missed speakers and false alarms has always been an important diagnostic. NIST defined the task of speaker detection, with the associated Detection Cost Function (DCF) to evaluate performance, and introduced the DET plot [1] as a diagnostic tool. Since the first evaluation in 1996, the research community has embraced these evaluation tools. Although the DCF is an excellent measure, its parameters imply a particular application of the speaker detection technology. In this chapter we introduce an evaluation measure that instead averages detection performance over application types. This metric, Cllr, was first introduced in 2004 by one of the authors [2]. Here we introduce the subject with a minimum of mathematical detail, concentrating on the various interpretations of Cllr and its practical application. We emphasize the difference between the discrimination ability of a speaker detector (‘the position/shape of the DET-curve’) and the calibration of the detector (‘how well the threshold was set’). If speaker detectors can be built to output well-calibrated log-likelihood-ratio scores, such detectors can be said to have an application-independent calibration. The proposed metric Cllr evaluates both the discrimination ability of the log-likelihood-ratio scores and the quality of their calibration.
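The abstract does not give the formula for Cllr. As a minimal sketch, the definition commonly cited in the literature is assumed here: with natural-log likelihood ratios $\ell$, Cllr is the average of the mean of $\log_2(1 + e^{-\ell})$ over target trials and the mean of $\log_2(1 + e^{\ell})$ over non-target trials. The function names `softplus` and `cllr` and the toy scores are hypothetical, chosen only for illustration.

```python
import math


def softplus(x):
    """Numerically stable ln(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def cllr(target_llrs, nontarget_llrs):
    """Cllr in bits, assuming scores are natural-log likelihood ratios.

    Assumed definition (not stated in the abstract):
      0.5 * [ mean over targets     of log2(1 + exp(-llr))
            + mean over non-targets of log2(1 + exp(+llr)) ]
    """
    # Target trials are penalised when their LLR is low.
    tar = sum(softplus(-s) for s in target_llrs) / len(target_llrs)
    # Non-target trials are penalised when their LLR is high.
    non = sum(softplus(s) for s in nontarget_llrs) / len(nontarget_llrs)
    return 0.5 * (tar + non) / math.log(2)  # convert nats to bits


# Toy scores (made up for illustration only).
targets = [2.0, 3.0, 1.0]
nontargets = [-2.0, -1.0, -3.0]

base = cllr(targets, nontargets)

# Adding a constant to every score keeps the ranking of trials, and hence
# the DET curve, unchanged, but it miscalibrates the scores.
shift = 4.0
shifted = cllr([s + shift for s in targets], [s + shift for s in nontargets])

# A detector that always outputs LLR = 0 carries no information.
uninformative = cllr([0.0], [0.0])
```

In this sketch the shifted scores discriminate exactly as well as the originals, yet their Cllr is worse than that of the uninformative detector, which scores 1 bit. This illustrates the abstract's point that Cllr is sensitive to calibration as well as to discrimination.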